Text Normalization Using Hybrid Approach

Authors

  • Meenakshi Sharma
Abstract

Machine Translation (MT) is an important area of Natural Language Processing that deals with translating one natural language into another. In this paper we present research on translating short messages into plain English text. As communication over the internet has grown through websites and other internet applications, short messages have become one of the most frequently used forms of communication. The receiver sometimes cannot understand these short messages, so they need to be translated into plain English text for the receiver to interpret the intended message correctly. Our system uses a hybrid approach that combines a Rule-Based Approach, a Statistical Machine Translation Approach, and a Direct Mapping Approach to translate short messages into plain English text. Translating short messages into plain English text is also known as Text Normalization.
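To make the idea of combining stages concrete, here is a minimal sketch of how a direct-mapping lookup might be chained with a rule-based fallback. It is not the authors' system: the abstract gives neither the lexicon nor the rules, so `DIRECT_MAP`, `REPEAT_RULE`, and both function names are hypothetical, and the statistical machine translation stage is left out entirely.

```python
import re

# Hypothetical direct-mapping lexicon: short form -> plain English.
# These entries are illustrative assumptions, not the paper's lexicon.
DIRECT_MAP = {
    "u": "you",
    "r": "are",
    "gr8": "great",
    "2moro": "tomorrow",
    "pls": "please",
    "thx": "thanks",
}

# Hypothetical rule: collapse a letter repeated three or more times
# into a single letter, e.g. "sooo" -> "so".
REPEAT_RULE = re.compile(r"([a-z])\1{2,}")


def normalize_token(token: str) -> str:
    """Lowercase one token, try the direct mapping, then fall back to the rule."""
    lowered = token.lower()
    if lowered in DIRECT_MAP:
        return DIRECT_MAP[lowered]
    return REPEAT_RULE.sub(r"\1", lowered)


def normalize(message: str) -> str:
    """Normalize a short message, splitting tokens on whitespace only."""
    return " ".join(normalize_token(t) for t in message.split())


if __name__ == "__main__":
    print(normalize("thx u r gr8"))
```

In this sketch, tokens that neither stage recognizes pass through unchanged (apart from lowercasing); in a hybrid system of the kind the abstract describes, those are the cases a statistical stage would presumably handle.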

Similar resources

SMS Text Normalization Using Hybrid Approach

Text normalization is the task of generating plain text from unnormalized text. Mobile technology has contributed to the evolution of several media of communication such as chats, emails and short message service (SMS) text. This has significantly influenced the traditional standard way of expressing views from letter writing to a high-tech form of expression known as texting language. In thi...

IITP: Hybrid Approach for Text Normalization in Twitter

In this paper we report our work on the normalization of noisy text in Twitter data. The method we propose is hybrid in nature, combining machine learning with rules. In the first step, a supervised approach based on conditional random fields is developed, and in the second step a set of heuristic rules is applied to the candidate wordforms for the normalization. The classifier is trained with a ...

Query-based Text Normalization Selection Models for Enhanced Retrieval Accuracy

Text normalization transforms words into a base form so that terms from common equivalence classes match. Traditionally, information retrieval systems employ stemming techniques to remove derivational affixes. Depluralization, the transformation of plurals into singular forms, is also used as a low-level text normalization technique to preserve more precise lexical semantics of text. Experiment ...

A Hybrid Approach Based on Higher Order Spectra for Clinical Recognition of Seizure and Epilepsy Using Brain Activity

Introduction: This paper proposes a reliable and efficient technique to recognize different epilepsy states, including healthy, interictal, and ictal states, using Electroencephalogram (EEG) signals. Methods: The proposed approach consists of pre-processing, feature extraction by higher order spectra, feature normalization, feature selection by genetic algorithm and ranking method, and classif...

A Framework for Translating SMS Messages

Short Messaging Service (SMS) has become a popular form of communication. While it is predominantly used for monolingual communication, it can be extremely useful for facilitating cross-lingual communication through statistical machine translation. In this work we present an application of statistical machine translation to SMS messages. We decouple the SMS translation task into normalization f...

Publication date: 2015